A MARKOV DECISION PROCESS WITH NON-STATIONARY TRANSITION LAWS

Related articles

Genetic Distance for a General Non-Stationary Markov Substitution Process

The genetic distance between biological sequences is a fundamental quantity in molecular evolution. It pertains to questions of rates of evolution, existence of a molecular clock, and phylogenetic inference. Under the class of continuous-time substitution models, the distance is commonly defined as the expected number of substitutions at any site in the sequence. We eschew the almost ubiquitous...

Linear programming formulation for non-stationary, finite-horizon Markov decision process models

Linear programming (LP) formulations are often employed to solve stationary, infinite-horizon Markov decision process (MDP) models. We present an LP approach to solving nonstationary, finite-horizon MDP models that can potentially overcome the computational challenges of standard MDP solution procedures. Specifically, we establish the existence of an LP formulation for risk-neutral MDP models wh...

Learning in non-stationary Partially Observable Markov Decision Processes

We study the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not perfectly known and may change over time. We present the algorithm MEDUSA+, which incrementally improves a POMDP model using selected queries, while still optimizing the reward. Empirical results show the response of the algorithm to changes in the parameters of a m...

A generalized Markov decision process

In this paper we present a generalized Markov decision process that subsumes the traditional discounted, infinite-horizon, finite state and action Markov decision process, Veinott's discounted decision processes, and Koehler's generalization of these two problem classes. Résumé (translated from French): In this article we present a generalized Markov process that encompasses the Markov decision process ...

On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes

We consider infinite-horizon stationary γ-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. Using Value and Policy Iteration with some error ε at each iteration, it is well-known that one can compute stationary policies that are (2γ/(1−γ)²)ε-optimal. After arguing that this guarantee is tight, we develop variations of Value and Policy Iter...
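
As a rough illustration of the guarantee quoted in the abstract above, read here as 2γ/(1−γ)² times the per-iteration error ε, the following Python sketch evaluates that bound for a few discount factors. The function name `stationary_bound` and the sample values of γ and ε are assumptions made for illustration; they do not come from the cited paper.

```python
def stationary_bound(gamma: float, eps: float) -> float:
    """Evaluate 2*gamma / (1 - gamma)**2 * eps.

    gamma: discount factor, assumed to lie in [0, 1).
    eps:   per-iteration approximation error (hypothetical value).
    """
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return 2.0 * gamma / (1.0 - gamma) ** 2 * eps


if __name__ == "__main__":
    # Illustrative discount factors only; the error level 0.01 is arbitrary.
    for gamma in (0.5, 0.9, 0.99):
        print(f"gamma={gamma}: bound={stationary_bound(gamma, eps=0.01):.4g}")
```

For a fixed ε the bound scales like 1/(1−γ)², so it grows quickly as γ approaches 1.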

Journal

Journal title: Bulletin of Mathematical Statistics

Year: 1968

ISSN: 0007-4993

DOI: 10.5109/13030